Hierarchical Clustering of Speakers into Accents with the ACCDIST Metric

Author

  • Mark Huckvale
Abstract

Hierarchical clustering of speakers by their pronunciation patterns could be a useful technique for the discovery of accents and the relationships between accents and sociological variables. However, it is first necessary to ensure that the clustering is not influenced by the physical characteristics of the speakers. In this study, a number of approaches to agglomerative hierarchical clustering of 275 speakers from 14 regional accent groups of the British Isles are formally evaluated. The ACCDIST metric is shown to have superior performance both in terms of accent purity in the cluster tree and in terms of the interpretability of the higher levels of the tree. Although operating from robust spectral envelope features, the ACCDIST measure also showed the least sensitivity to speaker gender. The conclusion is that, if performed with care, hierarchical clustering could be a useful technique for discovery of accent groups from the bottom up.
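As an illustration of the evaluation idea in the abstract, the following is a minimal sketch of agglomerative clustering over a speaker-to-speaker distance matrix, followed by a purity score against known accent labels. The abstract does not state which linkage rule or purity definition the study used, so average linkage and majority-label purity are assumptions here, and the function names `agglomerate` and `purity` and the toy data are hypothetical.

```python
from collections import Counter


def agglomerate(dist, n_clusters):
    """Average-linkage agglomerative clustering (assumed linkage rule).

    dist is a symmetric matrix of pairwise speaker distances.
    Returns a list of clusters, each a list of speaker indices.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Mean distance over all cross-cluster speaker pairs.
                total = sum(dist[a][b] for a in clusters[i] for b in clusters[j])
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def purity(clusters, labels):
    """Fraction of speakers whose accent label is the majority in their cluster."""
    majority = sum(
        Counter(labels[i] for i in c).most_common(1)[0][1] for c in clusters
    )
    return majority / len(labels)


# Toy data (hypothetical): six speakers from two accent groups.
labels = ["a", "a", "a", "b", "b", "b"]
dist = [
    [0.0 if i == j else (0.1 if labels[i] == labels[j] else 0.9) for j in range(6)]
    for i in range(6)
]
clusters = agglomerate(dist, 2)
score = purity(clusters, labels)
```

In a real experiment the distance matrix would come from an accent metric such as ACCDIST, and purity could be inspected at several cut levels of the tree rather than at a single one.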


Similar resources

ACCDIST: a metric for comparing speakers' accents

This paper introduces a new metric for the quantitative assessment of the similarity of speakers' accents. The ACCDIST metric is based on the correlation of inter-segment distance tables across speakers or groups. Basing the metric on segment similarity within a speaker ensures that it is sensitive to the speaker’s pronunciation system rather than to his or her voice characteristics. The metric...
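The idea described above, correlating inter-segment distance tables across speakers, can be sketched as follows. This is an illustration under stated assumptions, not the published implementation: segments are represented as small feature vectors keyed by a word label, distances within a speaker are Euclidean, and the tables are compared with Pearson correlation. The names `distance_table`, `pearson`, and `accent_similarity`, and all data values, are hypothetical.

```python
import math
from itertools import combinations


def distance_table(segments):
    """Flattened table of distances between every pair of a speaker's segments.

    segments maps a segment label to a feature vector. Keys are sorted so that
    tables from different speakers line up entry by entry.
    """
    keys = sorted(segments)
    return [math.dist(segments[a], segments[b]) for a, b in combinations(keys, 2)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def accent_similarity(speaker_a, speaker_b):
    """Correlation of the two speakers' inter-segment distance tables."""
    return pearson(distance_table(speaker_a), distance_table(speaker_b))


# Toy data (hypothetical): each vowel is a 2-D feature vector.
spk_a = {"beat": (1.0, 0.0), "bat": (0.0, 2.0), "boot": (3.0, 1.0), "bath": (2.0, 3.0)}
# Same pronunciation system, different voice: scaled and shifted copy of spk_a.
spk_b = {k: (1.3 * x + 0.5, 1.3 * y + 0.5) for k, (x, y) in spk_a.items()}
# Different system: "bath" is realised close to "bat".
spk_c = dict(spk_a, bath=(0.1, 2.1))

sim_ab = accent_similarity(spk_a, spk_b)
sim_ac = accent_similarity(spk_a, spk_c)
```

Because only distances within each speaker enter the tables, a uniform shift or scaling of one speaker's feature space leaves the correlation essentially unchanged, while a change in which vowels are close to which lowers it. A distance for clustering could then be taken as, for example, one minus this correlation, which is again an assumption.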


Computer and Human Recognition of Regional Accents of British English

This paper is concerned with classification of the 14 regional accents of British English in the ABI (Accents of the British Isles) speech corpus. Results are reported using a state-of-the-art Language Identification system, variants of Huckvale’s ACCDIST system, and human listeners. The best performance, 95.18% accuracy, is obtained using the text-dependent ACCDIST measure. The performance of a ...


Cross Entropy Information Metric for Quantification and Cluster Analysis of Accents

This paper proposes a method for the measurement and quantification of the impact of accents on speech models. An accent metric is introduced based on the cross entropy (CE) of the probability models of speech from different accents. The CE metric has potential for use in the analysis, identification, quantification and ranking of the salient features of accents. The accent metric is used for phon...


Composite Kernel Optimization in Semi-Supervised Metric

Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated a considerable interest in the to...


Vowel systems and accent similarity in the British Isles: Exploiting multidimensional acoustic distances in phonetics

We illustrate how a high-dimension feature space typically used in speech technology can be adapted to the phonetic description of vowels in 13 accents of the British Isles. In a previous work (Ferragne & Pellegrino, 2010), we carried out a formant investigation of the vowel systems of the British Isles; due to erroneous formant estimation, two-thirds of the speakers had to be left out. The pre...




Publication date: 2007